Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/snap-update-ns: keep order of existing mount entries #14494

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

zyga
Copy link
Contributor

@zyga zyga commented Sep 12, 2024

This is stacked on top of #14491

We recently realized that the order of saved mount profile, after
performing mount namespace update, was subtly incorrect. The order was
seemingly swapped after each operation. In reality, the order reflected
the order of performed changes, including the special change type of
"keep", which means that an entry is kept intact.

Fix the problem by storing "kept" changes in the original order, which
is back-to-front compared to the order of changes computed by the
algorithm. At the same time, keep track of mount changes in the order of
their execution.

This probably fixes a hell lot of edge cases and weird behaviors that
went undiagnosed for many years. Interestingly a test case had a TODO
that hinted at the problem, but it was not clear what the problem was at
the time. The TODO is now gone and the test case is now up-to-date.

Test data for cmd/snap-update-ns/testdata was re-generated.

Jira: https://warthogs.atlassian.net/browse/SNAPDENG-31644

Signed-off-by: Zygmunt Krynicki [email protected]

Snapd is affected by several bugs related to content-layout interaction.
There's a very popular scheme where a snap has two interacting elements
that only work in one special edge case and break regularly:

- A content plug is used to receive content of another snap, typically
  under $SNAP/foo somewhere. Where foo exists and is not created with a
  mimic.
- A layout entry is used to "distribute" $SNAP/foo to /usr/share/foo or
  other locations.

This particular arrangement is used to share essential libraries from a
common "runtime" snap which is not a dedicated base. Several basic tests
explore the actual behavior in the following situations.

- The content is initially connected and a mount namespace is
  constructed. This is the only correct behavior show in this group of
  tests. This is known as test case 1.
- The content is initially connected and a mount namespace is
  constructed. The content is then disconnected. This shows several
  anomalies that are referenced in the code (ordering bug, propagation
  bug). This is known as test case 2.
- The content is initially connected and a mount namespace is
  constructed. The content is then disconnected and re-connected.
  This is known as test case 3.
- The content is initially disconnected and a mount namespace is
  constructed. The content is then connected.
  This is known as test case 4.
- The content interface is initially connected and a mount namespace is
  constructed. The content snap is then refreshed.
  This is known as test case 5.
- The content interface is initially connected and a mount namespace is
  constructed. The application snap is then refreshed.
  This is known as test case 6.

This is related to issues:
- https://warthogs.atlassian.net/browse/SNAPDENG-31644 (flip flop)
- https://warthogs.atlassian.net/browse/SNAPDENG-31645 (propagation)

A support makefile and a set of data files is provided to re-generate
the test data once snapd is improved.

Signed-off-by: Zygmunt Krynicki <[email protected]>
Signed-off-by: Zygmunt Krynicki <[email protected]>
@zyga zyga marked this pull request as ready for review September 12, 2024 11:16
We recently realized that the order of saved mount profile, after
performing mount namespace update, was subtly incorrect. The order was
seemingly swapped after each operation. In reality, the order reflected
the order of performed changes, including the special change type of
"keep", which means that an entry is kept intact.

Fix the problem by storing "kept" changes in the original order, which
is back-to-front compared to the order of changes computed by the
algorithm. At the same time, keep track of mount changes in the order of
their execution.

This probably fixes a hell lot of edge cases and weird behaviors that
went undiagnosed for many years. Interestingly a test case had a TODO
that hinted at the problem, but it was not clear what the problem was at
the time. The TODO is now gone and the test case is now up-to-date.

Test data for cmd/snap-update-ns/testdata was re-generated.

Jira: https://warthogs.atlassian.net/browse/SNAPDENG-31644

Signed-off-by: Zygmunt Krynicki <[email protected]>
@zyga zyga force-pushed the fix/order-of-saved-profile-SNAPDENG-31644 branch from 15335e7 to aaceabb Compare September 12, 2024 13:14
@@ -65,6 +65,10 @@ type Change struct {
Action Action
}

func (c *Change) GoString() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's useful for printing changes. It's not use here but really valuable in practice.

@@ -93,8 +93,14 @@ func executeMountProfileUpdate(upCtx MountProfileUpdateContext) error {
// Compute the new current profile so that it contains only changes that were made
// and save it back for next runs.
var currentAfter osutil.MountProfile
for i := len(changesMade) - 1; i >= 0; i-- {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's useful to leave a comment for ourselves that the initial state was initially iterated over from back to front, so the things we keep are also present in reverse order. Thus order to preserve the stable state for kept entries we should record them in reverse order.

Copy link
Contributor Author

@zyga zyga Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update this to make it clearer what is going on and add an explicit test. Expect this in the rebase after the base branch lands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed now

The new function takes the previously existing algorithm out of
executeMountProfileUpdate, as the original location makes it very tricky
to test due to the presence of all the system interactions through
change.Perform.

Add some simple tests that show how keep, mount and unmount changes are
preserved.

Signed-off-by: Zygmunt Krynicki <[email protected]>
Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I said looking at how NeededChanges works at the moment it appears this will mostly affect the order of unmounting of things we kept that we cannot keep anymore the next time around, it looks the right direction but it's hard to predict the general effects, as the logic in NeededChanges is helpful to an extent but as we know has limitations and is hard to predict atm

return upCtx.SaveCurrentProfile(&currentAfter)
}

func CurrentProfileFromChangesMade(changes []*Change) osutil.MountProfile {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs a comment of its own

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ernestl ernestl added this to the 2.66 milestone Sep 19, 2024
Copy link
Contributor

@bboozzoo bboozzoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants